Abstract. This paper proposes the first implementation of an atomic storage tolerant
to mobile Byzantine agents. Our implementation is designed for the round-based
synchronous model where the set of Byzantine nodes changes from round to round.
In this model we explore the feasibility of a multi-writer multi-reader atomic
register prone to various mobile Byzantine behaviors. We prove upper and lower
bounds for solving the atomic storage in all the explored models. Our [...]
mobile Byzantine failures.
1 Introduction

Byzantine-tolerant storage is an active research area and this problem has been studied
in various settings and models (e.g., [3,15,10,11], to cite just a few). Recently,
several works have investigated this problem in the case where the system starts in an
arbitrary state. To cope with this situation, stabilizing Byzantine-tolerant algorithms
have been proposed in [1,5,7]. In all the above-mentioned works the set of Byzantine
processes is assumed to be static. That is, the set of nodes exhibiting a Byzantine
behavior does not change during the computation.

In the current work we investigate a different fault model where Byzantine faults are
mobile. This model captures insider attacks or virus propagation. In the mobile
Byzantine fault model, transient state corruptions, which can be abstracted as Byzantine
"agents," can move through the network and corrupt the nodes they occupy. A node
occupied by a Byzantine agent will behave arbitrarily for a transient period of time.
Once the Byzantine agent leaves the node, the node eventually behaves correctly. However,
the Byzantine agent may "infect" another node that behaved correctly until the
infection. This models the situation where, as soon as a faulty node is repaired, another
one becomes compromised.

There are two main research directions in the mobile Byzantine area: Byzantine agents
with constrained mobility and Byzantine agents with unconstrained mobility. In both
models the only distributed problem studied so far is the agreement problem. Byzantine
agents with constrained mobility were studied by Buhrman et al. [6]. They consider that
Byzantine agents move from one node to another only when protocol messages are sent
(similar to how viruses would propagate).

In the case of unconstrained mobility the motion of Byzantine agents is not tied to
the message exchange. Several authors investigated the agreement problem in variants
of this model [2,4,8,12,13,14]. Reischuk [13] investigates the stability/stationarity of
malicious agents for a given period of time. Ostrovsky and Yung [12] introduced the
notion of mobile virus and investigated an adversary that can inject and distribute faults.

Our work follows the line opened by Garay [8]. Garay [8] and, more recently,
Banu et al. [2], Sasaki et al. [14] and Bonnet et al. [4] consider, in their models,
that processes execute synchronous rounds composed of three phases: send, receive,
compute. Between two consecutive rounds, Byzantine agents can move from one host
to another, hence the set of faulty processes has a bounded size although its membership
can change from one round to the next.

In the current work we focus on four of the models discussed above, all of which consider
a synchronous round-based system: Garay [8], Buhrman et al. [6], Sasaki et al. [14] and
Bonnet et al. [4]. In Garay's model a process has the ability to detect its own infection
after the Byzantine agent has left it. More precisely, during the first round following the
departure of the Byzantine agent, a process enters a state, called cured, during which it
can take preventive actions to avoid sending messages that are based on a corrupted state.
Garay [8] proposes in this model an algorithm that solves Mobile Byzantine Agreement
provided that $n > 6t$ (dropped later to $n > 4t$ in [2]).

Buhrman et al. [6] propose a model where the motion of Byzantine agents is tied
to the message exchange. In this model they prove a tight bound for Mobile Byzantine
Agreement ($n > 3t$, where $t$ is the maximal number of simultaneously faulty processes)
and propose a time-optimal protocol that matches this bound.

Bonnet et al. [4] investigated the same problem in a model where processes do
not have the ability to detect when Byzantine agents move. However, differently from
Sasaki et al. [14], cured processes have control over the messages they send. This subtle
difference in the power of Byzantine agents has an impact on the bounds for solving
agreement: in Sasaki's model the bound is $n > 6t$, whereas in Bonnet's model it is
$n > 5t$, and this bound is proven tight.

Our contribution. As far as we know, our construction is the first that builds a
distributed MWMR atomic memory on top of synchronous round-based servers, which
communicate by message passing, and where some of them can exhibit a Byzantine
behavior induced by a mobile malicious agent. We first prove upper bounds on the
number of faulty processes for four of the mobile Byzantine models cited above: Garay
[8], Buhrman et al. [6], Sasaki et al. [14] and Bonnet et al. [4]. Then, we propose tight
implementations of an atomic register in each of these models, together with their
correctness proofs. The first study focuses on the model of Garay [8], where nodes can
detect that they were previously infected by a Byzantine agent and remain silent until
their state is cleaned. In this model, we implement the atomic register provided that in
each round the number of Byzantine nodes (nodes occupied by a Byzantine agent), $f$,
is less than $n/3$, where $n$ is the number of correct nodes in that round. The second study
concerns the models of Sasaki et al. [14] and Bonnet et al. [4], where infected nodes
cannot locally detect the presence or the absence of a Byzantine agent and hence can
send/compute based on a corrupted state even though the mobile agent is no longer
located at that node. In both these models we implement the atomic register provided
that in each round the number of Byzantine nodes $f$ is less than $n/4$, where $n$ is the
number of correct nodes in the round. Note that, differently from the case of the agreement
problem, these models have the same power in the case of atomic memory implementation.
The last studied model is that of Buhrman et al. [6], where Byzantine agents move with
the messages. In this model, we provide an implementation of the atomic memory provided
that $f$ is less than $n/2$. Note that all the above bounds are also lower bounds for
the considered models.

Paper roadmap. The paper is organized as follows. In Section 2 we define the model
of the system and the problem of MWMR atomic memory. In Section 3 we prove upper
bounds on the number of faulty processes for implementing MWMR atomic memory in
the following four mobile Byzantine models: Garay [8], Buhrman et al. [6], Sasaki et
al. [14] and Bonnet et al. [4]. In Section 4 we present a generic tight algorithm that
implements MWMR atomic memory and is parameterized by the considered mobile
Byzantine model. The correctness of the generic algorithm is proved in Section 4.2.
Finally, Section 5 concludes the paper and discusses some open research directions.

2 Model and Problem Definition 

2.1 System Model 

We consider a distributed system composed of an arbitrarily large set of clients $C$ and a
set of $n$ servers $S = \{s_1, s_2, \ldots, s_n\}$. Each process in the distributed system (i.e., both
servers and clients) is identified through a unique integer identifier. Servers run a
distributed protocol implementing a shared memory abstraction.

Communication model and timing assumptions. Processes communicate through message
passing. In particular, we assume that (i) each client $c_i \in C$ can communicate
with every server through a broadcast primitive, (ii) servers can communicate among
themselves through a broadcast primitive and (iii) servers can communicate with clients
through point-to-point channels. We assume that communications are authenticated (i.e.,
given a message $m$, the identity of its sender cannot be forged) and reliable (i.e.,
messages are not created, lost or duplicated).

The system evolves in synchronous rounds. Every round is divided into three phases:
(i) send, where processes send all the messages for the current round, (ii) receive, where
processes receive all the messages sent at the beginning of the current round and (iii)
computation, where processes process received messages and prepare those that will be
sent in the next round. Processes have access to the current round number via a local
variable that we usually denote by $r$.
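The three-phase round structure can be sketched as follows. This is only an illustration under simplifying assumptions: the `Process` class, its `outbox`, `inbox` and `seen` fields, the `HELLO` message and the `run_round` helper are hypothetical names introduced here, and communication is modeled as a reliable all-to-all broadcast.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """Hypothetical process that simply counts the messages it has seen."""
    pid: int
    r: int = 0                                   # local copy of the round number
    outbox: list = field(default_factory=list)   # messages prepared for the next send
    inbox: list = field(default_factory=list)    # messages received this round
    seen: int = 0

    def compute(self):
        # Computation phase: process received messages, prepare the next ones.
        self.seen += len(self.inbox)
        self.outbox = [("HELLO", self.pid, self.r + 1)]
        self.inbox = []


def run_round(processes, r):
    """Run one synchronous round: send, then receive, then compute."""
    # Send phase: every process emits the messages prepared earlier.
    in_flight = []
    for p in processes:
        p.r = r
        in_flight.extend(p.outbox)
        p.outbox = []
    # Receive phase: reliable channels deliver everything sent in this round.
    for p in processes:
        p.inbox = list(in_flight)
    # Computation phase.
    for p in processes:
        p.compute()


procs = [Process(pid=i, outbox=[("HELLO", i, 0)]) for i in range(3)]
for r in range(2):
    run_round(procs, r)
```

Keeping the three loops separate mirrors the synchrony assumption: no process starts computing before every message of the round has been delivered.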


Failure model. We assume that an arbitrary number of clients may crash while servers
are affected by mobile Byzantine failures (MBF) [4,8,6,14]. Informally, in the mobile
Byzantine failure model, faults are represented by powerful, computationally unbounded
agents that move arbitrarily from one server to another. When an agent is on a server, it
can corrupt the server's local variables, force it to send arbitrary messages (potentially
different from process to process), etc. However, the agent cannot corrupt the identity of
the server. We assume that, in each round, at most $f$ servers can be affected by a mobile
Byzantine failure. When an agent occupies a server $s_i$ we say that $s_i$ is faulty. When
the agent leaves $s_i$, the server is said to be cured until it restores a correct internal state.
If a server is neither faulty nor cured then it is said to be correct. We assume, as in
[4,8,14], that each server has a tamper-proof memory where it safely stores the correct
algorithm code. When the agent leaves a server $s_i$ (i.e., the server becomes cured), the
server recovers the correct algorithm code from the tamper-proof memory. Concerning the
assumptions on agent movements and on a server's awareness of its cured state, different
models have been defined. In this paper we consider all the following variants of mobile
Byzantine failures [4,8,6,14]:

- (M1) Garay's model [8]. In this model, agents can move arbitrarily from one server
to another at the beginning of each round (i.e., before the send phase starts). When a
server is in the cured state it is aware of its condition and thus can remain silent to
prevent the dissemination of wrong information until its code has been completely
restored and its state is corrected.

- (M2) Bonnet et al.'s model [4] and (M3) Sasaki et al.'s model [14]. As in the previous
model, agents can move arbitrarily from one server to another at the beginning of
each round (i.e., before the send phase starts). Differently from Garay's model,
in both models it is assumed that servers do not know whether they are correct or cured
once the Byzantine agent has moved. The main difference between these two models
is that in the model of [14] a cured process still acts as a Byzantine one for one extra round.

- (M4) Buhrman's model [6]. Differently from the previous models, agents move
together with the messages (i.e., with the send or broadcast operation). However,
when a server is in the cured state it is aware of it.

2.2 Atomic Registers 

A register is a shared variable accessed by a set of processes, i.e., clients, through two
operations, namely read() and write(). Informally, the write() operation updates the
value stored in the shared variable while the read() operation obtains the value contained
in the variable (i.e., the last written value). Every operation issued on a register is,
generally, not instantaneous and it can be characterized by two events occurring at its
boundary: an invocation event and a reply event. These events occur at two time instants
(invocation time and reply time) according to the fictional global time.

An operation op is complete if both the invocation event and the reply event occur (i.e.,
the process executing the operation does not crash between the invocation and the reply).
Conversely, an operation op is said to be failed if it is invoked by a process that
crashes before the reply event occurs. According to these time instants, it is possible to
state when two operations are concurrent with respect to the real-time execution. For
ease of presentation we assume the existence of a fictional global clock, and the invocation
time and response time of every operation are defined with respect to this fictional
clock.

Given two operations $op$ and $op'$, their invocation times ($t_B(op)$ and $t_B(op')$)
and their return times ($t_E(op)$ and $t_E(op')$), we say that $op$ precedes
$op'$ ($op \prec op'$) iff $t_E(op) < t_B(op')$. If $op$ does not precede $op'$ and $op'$ does not
precede $op$, then $op$ and $op'$ are concurrent ($op \| op'$). Given a write(v) operation, the
value $v$ is said to be written when the operation is complete.
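The precedence and concurrency relations can be expressed directly in code. The sketch below is a minimal illustration; the `Op` record and its field names are assumptions introduced here, with `t_begin` standing for the invocation time and `t_end` for the reply time on the fictional global clock.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    name: str
    t_begin: float   # invocation time
    t_end: float     # reply time


def precedes(op, other):
    """op precedes other iff op's reply occurs before other's invocation."""
    return op.t_end < other.t_begin


def concurrent(op, other):
    """Two operations are concurrent iff neither precedes the other."""
    return not precedes(op, other) and not precedes(other, op)


w = Op("write(v)", 0.0, 1.0)
r1 = Op("read()", 0.5, 2.5)   # overlaps w
r2 = Op("read()", 3.0, 5.0)   # invoked after w has returned
```

Here `w` and `r1` are concurrent, while `w` precedes `r2`.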

We assume that locally a client never performs read() and write() operations concurrently.
We also assume that initially the register stores a default value $\bot$ written by a
fictional write($\bot$) operation happening instantaneously at round $r_0$. In case of
concurrency while accessing the shared variable, the meaning of last written value becomes
ambiguous. Depending on the semantics of the operations, three types of register have
been defined by Lamport [9]: safe, regular and atomic. In this paper, we consider
a Multi-Writer/Multi-Reader (MWMR) atomic register, which is specified as follows:

- Termination: If a correct client invokes an operation, it eventually returns from that 
operation. 

- Validity: A read operation returns the last value written before its invocation, or a 
value written by a write operation concurrent with it. 

- Ordering: There exists a total order $S$ of read() and write() operations such that (i) if
$op \prec op'$ then $op$ appears before $op'$ in $S$ and (ii) any read() operation returns the
value $v$ written by the last write() preceding it in $S$.
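The Ordering property can be illustrated with a small checker that tests whether a candidate total order satisfies both clauses. This is a sketch under assumptions made here: the `Op` record, the `is_valid_order` name, and the use of `None` for the default value are hypothetical, and the checker only validates a given order; it does not search for one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Op:
    kind: str                 # "read" or "write"
    value: Optional[str]      # value written, or value returned by a read
    t_begin: float            # invocation time
    t_end: float              # reply time


def is_valid_order(order, initial=None):
    """Check a candidate total order against the two Ordering clauses."""
    # Clause (i): the order must respect real-time precedence. If an entry
    # placed later finished before an earlier entry started, reject.
    for i, earlier in enumerate(order):
        for later in order[i + 1:]:
            if later.t_end < earlier.t_begin:
                return False
    # Clause (ii): every read returns the value of the last preceding write.
    last_written = initial
    for op in order:
        if op.kind == "write":
            last_written = op.value
        elif op.value != last_written:
            return False
    return True


w1 = Op("write", "a", 0.0, 1.0)
w2 = Op("write", "b", 2.0, 3.0)
r_b = Op("read", "b", 2.5, 4.5)   # concurrent with w2, returns the new value
r_a = Op("read", "a", 2.5, 4.5)   # concurrent with w2, returns the old value
```

Both reads are acceptable for an atomic register, each under a different total order: `[w1, w2, r_b]` and `[w1, r_a, w2]` pass the check, whereas `[w1, r_b, w2]` does not.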

3 Upper Bounds on the number of Faults 

The next theorems provide upper bounds on the number of faulty processes for the
implementation of a MWMR Atomic Register in the models of mobile Byzantine faults
[4,8,6,14].

Theorem 1. If $n \leq 3f$, there exists no algorithm that implements a MWMR Atomic
Register in Garay's model [8].

Proof. Consider that each read() operation takes at least one round to be executed and,
according to Garay's model, at the beginning of each round servers are partitioned
into three sets: (i) faulty, (ii) cured and (iii) correct. Due to the assumption that we have
$f$ faulty servers in each round, the cured processes, in the worst case, number $f$ as
well (i.e., the $f$ servers that were faulty in the previous round). Thus, considering that
$n$ is at most $3f$, it follows that, in the worst case, at most $f$ processes are correct. As a
consequence, considering that cured servers are silent (they do not send any message),
the reader will gather at most $2f$ values and will not be able to distinguish those that
come from correct servers from those coming from faulty ones. □ Theorem 1

Theorem 2. If $n \leq 4f$, there exists no algorithm that implements a MWMR Atomic
Register in Sasaki's model [14].

Proof. The claim simply follows by considering that each read() operation takes at least
one round to be executed and, according to Sasaki's model, at the beginning of each
round servers are partitioned into three sets: (i) faulty, (ii) cured and (iii) correct. Due to
the assumption that at most $f$ faulty servers are present in each round, it follows that the
cured processes, in the worst case, number $f$ (i.e., the $f$ servers that were faulty in the
previous round). Thus, considering that $n$ is at most $4f$, we have that, in the worst case,
at most $2f$ processes are correct. As a consequence, considering that cured servers act like
faulty ones as well, the reader will get back at most $4f$ values and will not be able to
distinguish those that come from correct servers (i.e., $2f$ equal values $v$) from those
coming from faulty ones (i.e., $2f$ equal values $v'$). □ Theorem 2


Theorem 3. If $n \leq 4f$, there exists no algorithm that implements a MWMR Atomic
Register in Bonnet's model [4].

Proof. The claim simply follows by considering that Bonnet's model is a particular
case of Sasaki's model, in which cured servers act as less powerful faulty servers, forced
to send the same message to all. The same reasoning as in the proof of Theorem 2
applies. □ Theorem 3

Theorem 4. If $n \leq 2f$, there exists no algorithm that implements a MWMR Atomic
Register in Buhrman's model [6].

Proof. The proof is similar to the static case [3]. Let us suppose by contradiction that
such an algorithm exists and suppose, without loss of generality, that $n = 2f$. Let
$v$ be the value written by the last completed write() operation and let us assume that
no other operations are concurrent with the read(). In this setting, when the client gets
values from servers, it will receive at most $f$ copies of the value $v$ from correct servers
and $f$ copies of a value $v'$, with $v' \neq v$, from faulty servers. As a consequence, the reader
has no way to distinguish between the two values and we have a contradiction.
□ Theorem 4
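The worst-case counts used in the proofs of Theorems 1-4 can be tabulated with a few lines of code. The sketch below is illustrative only: `boundary_view` is a hypothetical helper introduced here, it fixes $n$ exactly at each impossibility bound, and for M4 it assumes no cured servers, following the static-case argument of the proof of Theorem 4.

```python
def boundary_view(model, f):
    """Return (n, matching correct replies, possible wrong replies) seen by
    a reader when n sits exactly on the impossibility bound of the model."""
    if model == "M1":            # Garay: n = 3f, cured servers stay silent
        n, faulty, cured, cured_speak = 3 * f, f, f, False
    elif model in ("M2", "M3"):  # Bonnet / Sasaki: n = 4f, cured may reply
        n, faulty, cured, cured_speak = 4 * f, f, f, True
    elif model == "M4":          # Buhrman: n = 2f, assumed no cured servers
        n, faulty, cured, cured_speak = 2 * f, f, 0, False
    else:
        raise ValueError(f"unknown model {model!r}")
    correct = n - faulty - cured
    wrong = faulty + (cured if cured_speak else 0)
    return n, correct, wrong


for m in ("M1", "M2", "M3", "M4"):
    n, correct, wrong = boundary_view(m, f=2)
    print(m, n, correct, wrong)
```

In every model the number of correct replies equals the number of possibly wrong replies at the bound, which is the tie the reader cannot break.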


4 Tight MWMR Atomic Register Implementation 

In this section we present a generic algorithm AAreg (Figs. 1-2) that implements the
MWMR Atomic Register in all the models presented above. In order to abstract the
knowledge a server has of its state (i.e., cured or correct), we introduce the cured_state
oracle. When invoked via the report_cured_state() function, it returns true to cured servers
and false to the others in the models of Garay [8] and Buhrman et al. [6]. In this case the
oracle is said to be enabled. The cured_state oracle always returns false in the models of
Sasaki et al. [14] and Bonnet et al. [4]. In this case the oracle is said to be disabled.

In the following we propose a generic MWMR atomic register algorithm that is
tight for all the above models by just tuning the following three parameters: $\alpha$, $\beta$ and
the cured_state oracle status. We express the number of servers with respect to the number
of faulty servers as $n > \alpha f$, where $\alpha \in \{2, 3, 4\}$ depends on the mobile Byzantine model.
Let $s$ be the minimal number of required occurrences of the same value in order to choose it,
$s = n - \beta f$. Basically $s$ has to be greater than the number of possible wrong values
that faulty and cured servers can return, which is $\beta f$, where $\beta \in \{1, 2\}$ depending on
the model adopted for the cured servers.

Table 1 summarizes the above in a synthetic way. 


Table 1. AAreg parameters for the four different Mobile Byzantine Failure models. 


Failure model         M id   $\alpha$   $\beta$   Oracle
Garay [8]             M1     3          2         enabled
Bonnet et al. [4]     M2     4          2         disabled
Sasaki et al. [14]    M3     4          2         disabled
Buhrman et al. [6]    M4     2          1         enabled
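The parameters of Table 1 and the threshold $s = n - \beta f$ can be captured in a small lookup. The sketch below is illustrative: `PARAMS`, `min_servers` and `threshold` are hypothetical names, and the value $\alpha = 3$ for M1 is taken from the bound stated in Theorem 1.

```python
# Parameters of Table 1: model id -> (alpha, beta, oracle enabled).
PARAMS = {
    "M1": (3, 2, True),    # Garay [8]
    "M2": (4, 2, False),   # Bonnet et al. [4]
    "M3": (4, 2, False),   # Sasaki et al. [14]
    "M4": (2, 1, True),    # Buhrman et al. [6]
}


def min_servers(model, f):
    """Smallest n satisfying n > alpha * f."""
    alpha, _, _ = PARAMS[model]
    return alpha * f + 1


def threshold(model, n, f):
    """Selection threshold s = n - beta * f, defined only when n > alpha * f."""
    alpha, beta, _ = PARAMS[model]
    if n <= alpha * f:
        raise ValueError("requires n > alpha * f")
    return n - beta * f
```

For example, with a single mobile agent ($f = 1$) the minimal configurations are 4 servers with threshold 2 in M1, 5 servers with threshold 3 in M2 and M3, and 3 servers with threshold 2 in M4.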


4.1 AAreg Algorithm description 

The presented algorithm exploits the round-based nature of the system model. Any
write() operation lasts one round, during which a client sends the value and all servers
deliver it in the same round. Due to the synchrony assumptions no acknowledgement
messages are required and the operation can terminate. If more than one write() operation
falls in the same round then every server receives the same set of values. The one
coming from the client with the highest identifier is stored, thus every server chooses the
same value. The read() operation lasts two rounds: one round to send a read request to
servers and the subsequent one to gather replies. The value whose number of occurrences
is at least the threshold $n - \beta f$ is returned.

Along with the classical read() and write(v) operations performed by clients, for
maintenance purposes servers echo their value to each other in each round. Thus, even
though in each round at most $f$ servers may lose the value (and no write() operation occurs),
thanks to the echoed values cured servers are able to become correct at the end of each
round, holding the same value as the correct servers.

Client local variables. Each client $c_i$ manages the following variables:

- to_send_i: a set in which messages to be sent in the next send phase are stored; it is
emptied just after.

- reading_i and writing_i: two boolean variables; only the one corresponding to the
current operation is set to true.

- op_start_i: a variable that stores the round in which a new operation starts
and is set to $\bot$ when the operation ends.

- rcv_i: a set variable (emptied at the beginning of each round), where $c_i$ stores
messages received during the current round $r$.

- replies_i: a set in which messages delivered after a read request are stored.


Server local variables. Each server $s_j$ manages the following variables:

- value_j: the maintained value.




At the beginning of each round r
(01) echo_vals_i ← ∅;
(02) current_writes_i ← ∅;
(03) cured_i ← report_cured_state();

Send Phase of round r
(04) if (¬ cured_i)
(05)    then broadcast ECHO(val, i);   % maintenance
(06)       for each j ∈ current_reads_i do
(07)          send REPLY(value_i) to c_j;   % reply to read() operations started in round r - 1
(08)       endFor
(09) endif
(10) current_reads_i ← ∅;

Receive Phase of round r
(11) for each ECHO(v, j) message in rcv_i do
(12)    echo_vals_i ← echo_vals_i ∪ v;
(13) endFor
(14) for each WRITE(v, j) message in rcv_i do
(15)    current_writes_i ← current_writes_i ∪ < v, j >;
(16) endFor
(17) for each READ(j) message in rcv_i do
(18)    current_reads_i ← current_reads_i ∪ {j};

Computation Phase of round r
(19) if (current_writes_i ≠ ∅)
(20)    then let v such that ∃ < v, j > ∈ current_writes_i ∧ j = max_k(< -, k >);
(21)       value_i ← v;
(22)    else if (∃ v ∈ echo_vals_i | #occurrence(v) ≥ n - βf)
(23)       then value_i ← v;
(24)    endif
(25) endif

Fig. 1. AAreg implementation: code executed by any server s_i.


- rcv_j: a set variable (emptied at the beginning of each round), where $s_j$ stores
messages received during the current round $r$.

- echo_vals_j: a set (emptied at the beginning of each round), in which the values
echoed by servers in each round are stored.

- current_writes_j: a set (emptied at the beginning of each round), in which the
values that clients want to write during the current round are stored.

- current_reads_j: a set in which the identifiers of clients that requested
a read are stored. It is emptied after the reply to such clients.

- cured_j: a boolean variable set through the report_cured_state() function. It is set to
true by the cured_state oracle (if enabled) when $s_j$ is in a cured state. Otherwise it is
always false.

Server maintenance. For maintenance purposes, at the beginning of each round, servers
exchange their stored value value_j, allowing cured servers to become correct at the end
of it. Thus, during the send phase of each round, servers broadcast the ECHO(val, i)
message (Fig. 1, line 05). If no new values have been written in the current round (the
condition at line 19 is not satisfied), during the computation phase (Fig. 1, line 22) they
choose the value with at least $n - \beta f$ occurrences. Note that when servers
are aware of being in a cured state (Fig. 1, line 04) they avoid sending their value_j.
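The server computation phase (Fig. 1, lines 19-25) can be sketched as a pure function. This is an illustration, not the authors' implementation: `server_compute` and its argument names are assumptions introduced here, and the echoed values are kept in a list (a multiset) so that occurrences can be counted.

```python
from collections import Counter


def server_compute(value, current_writes, echo_vals, n, f, beta):
    """Return the value a server stores at the end of a round.

    current_writes: list of (v, client_id) pairs received this round.
    echo_vals: list of values echoed by servers this round (one per echo).
    """
    if current_writes:
        # Lines 19-21: keep the value sent by the highest client identifier.
        v, _ = max(current_writes, key=lambda pair: pair[1])
        return v
    # Lines 22-23: adopt a value echoed at least n - beta * f times.
    for v, count in Counter(echo_vals).items():
        if count >= n - beta * f:
            return v
    # No value reaches the threshold: keep the current (possibly corrupted) one.
    return value


# Model M2 with n = 5 and f = 1: a cured server holding "junk" recovers "a"
# from three matching echoes, despite one wrong echo.
recovered = server_compute("junk", [], ["a", "a", "a", "b"], n=5, f=1, beta=2)
```

When writes are present they take priority over echoes, so all non-faulty servers that receive the same set of writes deterministically converge on the same value.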




operation read():
(01) to_send_i ← to_send_i ∪ {READ(i)};
(02) reading_i ← true;

operation write(v):
(03) to_send_i ← to_send_i ∪ {WRITE(v, i)};
(04) writing_i ← true;

Send Phase of round r
(05) for each m() ∈ to_send_i do broadcast m();
(06) if (op_start_i = ⊥)
(07)    then op_start_i ← r;
(08) endif
(09) to_send_i ← ∅;

Receive Phase of round r
(10) for each REPLY(v, j) message in rcv_i do
(11)    replies_i ← replies_i ∪ < v, j >;
(12) endFor

Computation Phase of round r
(13) if (writing_i ∧ op_start_i = r)
(14)    then writing_i ← false;
(15)       op_start_i ← ⊥;
(16)       return write_confirmation;
(17) endif
(18) if (reading_i ∧ op_start_i = r - 1)
(19)    then reading_i ← false;
(20)       op_start_i ← ⊥;
(21)       let v such that ∃ < v, j > ∈ replies_i ∧ #occurrence(v) ≥ n - βf;
(22)       replies_i ← ∅;
(23)       return v;
(24) endif

Fig. 2. AAreg implementation: code executed by any client c_i.


Write operation. When a client $c_i$ wants to write a value $v$, it stores in to_send_i a
message WRITE(v, i) and sets the variable writing_i to true (Fig. 2, lines 03-04). At the
subsequent send phase, $c_i$ broadcasts WRITE(v, i) to all servers, stores the current
round in op_start_i and empties the to_send_i set (Fig. 2, lines 05-09). At the server side
this message will be delivered within the same round during the receive phase and any
correct or cured server $s_j$ stores it in its current_writes_j set (Fig. 1, lines 14-15). At
the end of the round, during the computation phase, if current_writes_j is not empty
then the value associated with the highest client identifier is stored in value_j (Fig. 1,
lines 19-21).

Back to the client side, during its computation phase, if writing_i is true and op_start_i
is equal to the current round $r$, this means that during the current round $c_i$ performed a
write() operation. Since it lasts just one round, the client sets writing_i to false and
op_start_i to $\bot$, and returns the write_confirmation to the application layer (Fig. 2,
lines 13-17).


Read operation. When a client $c_i$ wants to read at round $r$, it stores in to_send_i
a message READ(i) and sets the variable reading_i to true (Fig. 2, lines 01-02). At
the subsequent send phase $c_i$ broadcasts the READ(i) message to all servers, stores the
current round $r$ in op_start_i and empties the to_send_i set (Fig. 2, lines 05-09). Note that
the check at line 06 is necessary to prevent op_start_i from being updated at each round.
This would not be an issue for the write() operation, which lasts only one round, but in
the case of the read() operation it would cause the loss of information about the starting
round. At the server side, the READ(i) message will be delivered within the same round
$r$ and any correct or cured server $s_j$ stores the client identifier in its current_reads_j
set (Fig. 1, lines 17-18).

At the start of the next round $r + 1$, if server $s_j$ is not cured, or is cured but not aware
of it, then it sends the message REPLY(value_j) to all the clients in its current_reads_j set,
which is emptied at the end of the send phase (Fig. 1, lines 06-10). At the client side all
the REPLY(value_j) messages are delivered and stored in the set replies_i during the receive
phase (Fig. 2, lines 10-12). Now, during the computation phase, the reading_i variable is
true and op_start_i stores the previous round number. Thus reading_i is set to false,
op_start_i is set to $\bot$, the value in replies_i that occurs at least $n - \beta f$ times is
returned to the application layer, and replies_i is emptied (Fig. 2, lines 18-24).
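The selection step at the end of a read() (Fig. 2, line 21) can be sketched as follows. The function name `select_read_value` and the representation of replies as `(value, server_id)` pairs are assumptions made for this illustration; returning `None` when no value reaches the threshold is also a choice made here, since the pseudocode does not specify that case.

```python
from collections import Counter


def select_read_value(replies, n, f, beta):
    """Pick the value reported by at least n - beta * f distinct servers.

    replies: iterable of (value, server_id) pairs collected for one read().
    Returns None when no value reaches the threshold.
    """
    # Identical pairs collapse, mirroring the set of < v, j > pairs.
    counts = Counter(v for v, _ in set(replies))
    for v, count in counts.items():
        if count >= n - beta * f:
            return v
    return None


# Model M2 with f = 1 and n = 5: one faulty and one cured server may lie,
# but three matching replies from correct servers reach the threshold 3.
replies = [("v", 1), ("v", 2), ("v", 3), ("bad", 4), ("bad", 5)]
result = select_read_value(replies, n=5, f=1, beta=2)
```

With $n > \alpha f$, the wrong replies alone cannot reach the threshold, so the selected value is the one stored by the correct servers.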

4.2 Correctness Proofs 

Lemma 1. Let $\alpha_{M_i}$ and $\beta_{M_i}$ be the parameters for each of the 4 failure models $M_i$
as reported in Table 1 and used by the algorithm in Figs. 1-2. Let $n > \alpha_{M_i} f$ for each
failure model $M_i$ considered. At the end of each round, at least $n - f$ correct servers
store the same value $v$ in their value_i local variable.

Proof. Each non-faulty server updates its value_i local variable at the end of each round
$r$ (i) in line 21, i.e., if there exists at least one pair in the current_writes_i local variable,
or (ii) in line 23, i.e., if current_writes_i is empty and there exist at least $n - \beta f$ equal
values in echo_vals_i.

First we prove that one of the two cases always happens and then we prove that the
number of non-faulty servers storing the same value $v$ is $n - f$. The current_writes_i
local variable is initialized by any non-faulty server $s_i$ to $\emptyset$ at the beginning of each
round $r$ (cf. line 02) and it is updated when a WRITE() message is received by $s_i$ (recall
that such a WRITE() message is sent by the writer client in the send phase of the first
round starting after the write() invocation and is delivered by any non-faulty server in
the same round). Thus, case (i) corresponds to a scenario where at least one write()
operation is executed in round $r$ and case (ii) corresponds to a scenario where no write()
is running.

- Case (i): current_writes_i $\neq \emptyset$. In this case the claim simply follows by considering
that (i) writer clients broadcast a WRITE(v, j) message in the send phase of
round $r$, (ii) clients are correct, so the same set of values is delivered to all servers,
which apply a deterministic function to select the value $v$, and (iii) at most $f$
servers are faulty and may skip the update of their value_i variable.

- Case (ii): current_writes_i $= \emptyset$ and line 22 is true. In this case, the value_i variable
is updated according to the values stored in echo_vals_i. This variable is emptied
by every non-faulty process at the beginning of each round (cf. line 01) and is
filled in when an ECHO() message is delivered. Such a message is sent at least by every
server that believes it is correct, at the beginning of each round. Let $r'$ be the round
in which the last write(v) operation terminated. Note that, by the above hypothesis,
a write() operation always exists, as we assume a fictional write happening
instantaneously at round $r_0$. Without loss of generality, let us consider the round
$r = r' + 1$. Due to case (i), at the end of $r'$, at least $n - f$ non-faulty servers store
the same value $v$ in their local variable value_i. Thus, at the beginning of $r' + 1$, at
least $n - f - x$ correct servers will send an ECHO(v, j) message, where $x$ is the
number of non-faulty processes that become faulty while passing from $r'$ to $r$ (i.e.,
$x = f$ for all the models but Buhrman's, where $x = 0$ as faulty processes move
during the send phase and not at the beginning of the round). It follows that the
condition in line 22 is verified if and only if $n - f - x \geq n - \beta f$, which is true in
any model. Therefore, considering that at the end of round $r$ the non-faulty servers are
exactly $n - f$, we have that $n - f$ processes will execute this update. Iterating the
reasoning for any $r$, the claim follows.
□ Lemma 1
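The inequality used in case (ii) can be checked model by model. The following worked step is a sketch that plugs in the values of $\beta$ from Table 1 and the bounds on $x$ stated in the proof ($x \leq f$ for M1-M3, $x = 0$ for M4):

```latex
\begin{align*}
\text{M1--M3 } (x \le f,\ \beta = 2):&\quad n - f - x \;\ge\; n - 2f \;=\; n - \beta f,\\
\text{M4 } (x = 0,\ \beta = 1):&\quad n - f - x \;=\; n - f \;=\; n - \beta f.
\end{align*}
```

In both cases the number of correct servers echoing $v$ meets the threshold, with equality in the worst case, which is why the condition in line 22 must be read as "at least $n - \beta f$ occurrences".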

Lemma 2. Let us consider the algorithm in Figs. 1-2. If a correct client invokes a write()
operation, it eventually returns from that operation.

Proof. The proof simply follows by considering that, for a write() operation invoked
at some round $r$, the write_confirmation is generated by the client at the end of the
same round, just by checking the value of the variables initialized at the beginning of $r$.
□ Lemma 2

Lemma 3. Let $\alpha_{M_i}$ and $\beta_{M_i}$ be the parameters for each of the 4 failure models $M_i$
as reported in Table 1 and used by the algorithm in Figs. 1-2. Let $n > \alpha_{M_i} f$ for each
failure model $M_i$ considered. If a correct client invokes a read() operation, it eventually
returns from that operation.

Proof Let Cj be a client invoking a read() operation at some time t. When this happens, 
Cj flags that a read() operation is starting and prepares a READ0 message to send at the 
beginning of the next send phase at round r. When Cj sends such READ0 message, it 
updates its op_startj variable to r and it returns from the read() operation at round r-(-1 
if and only if it has at least n — f3f occurrences of the same value in the replies j set. 
Such replies j is initially empty (it has been emptied at the end of the previous readQ 
operation) and it is filled in when Cj receives a REPLY0 message (line 11) that is sent 
at least by non-faulty servers when they receive a READ0 message. 

In particular, the READ() message sent by $c_j$ will be delivered by servers during the 
receive phase of round $r$. When this happens, any non-faulty server will execute line 
18 in Figure 1 and will store the identifier of $c_j$ in order to send a reply at the beginning 
of the next round $r + 1$. Due to Lemma 1, at the end of round $r$, at least $n - f$ non-faulty 
servers store the same value $v$. Let us note that, during the send phase of 
round $r + 1$, $x$ of these servers may become faulty. Thus, $c_j$ will find a value satisfying 
the condition in line 21 if and only if $n - f - x \ge n - \beta f$. Considering that $x \le f$ for 
all models but Buhrman's one, where $x = 0$, the condition is always true 
and the claim follows. 

□ Lemma 3 
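
The counting argument in the proof of Lemma 3 can be illustrated with a small sketch. It is not the algorithm of Fig. 1-2: the function names `select_value` and `read_terminates` are hypothetical, and the example assumes that at most $f$ replies carry a wrong value. It only shows the client-side rule "return a value that occurs at least $n - \beta f$ times" and the inequality $n - f - x \ge n - \beta f$ used above.

```python
from collections import Counter
from typing import Hashable, Iterable, Optional


def select_value(replies: Iterable[Hashable], n: int, f: int, beta: int) -> Optional[Hashable]:
    """Return a value occurring at least n - beta*f times in replies, else None.

    Hypothetical helper: mirrors the threshold test described for line 21,
    not the actual pseudocode of the paper.
    """
    threshold = n - beta * f
    counts = Counter(replies)
    for value, occurrences in counts.most_common():
        if occurrences >= threshold:
            return value
    return None


def read_terminates(n: int, f: int, x: int, beta: int) -> bool:
    """Check the inequality from the proof: n - f - x >= n - beta*f."""
    return n - f - x >= n - beta * f


if __name__ == "__main__":
    # Assumed example: n = 7, f = 2, beta = 2, and x = 2 servers turn faulty,
    # so n - f - x = 3 correct replies carry "v" and 2 replies carry "bad".
    replies = ["v"] * 3 + ["bad"] * 2
    print(select_value(replies, n=7, f=2, beta=2))
    print(read_terminates(n=7, f=2, x=2, beta=2))
```

In this assumed example the threshold is $n - \beta f = 3$, so the three correct replies are enough for the read to return, while two wrong replies alone are not.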

Theorem 5 (Termination). If a correct client invokes an operation, it eventually 
returns from that operation. 

Proof It follows directly from Lemma 2 and Lemma 3. □ Theorem 5 

Theorem 6 (Validity). Let $\alpha_{M_i}$ and $\beta_{M_i}$ be the parameters for each of the 4 failure 
models $M_i$ as reported in Table 1 and used by the algorithm in Fig. 1-2. Let $n > \alpha_{M_i} f$ 
for each failure model $M_i$ considered. Any read() operation returns the last value 
written before its invocation, or a value written by a concurrent write() operation. 

Proof Without loss of generality, let us consider the first write($v$) operation $op_W$ and 
the first read() operation $op_R$. Three cases may happen: (i) $op_R \prec op_W$, (ii) $op_W \prec op_R$ 
and (iii) $op_W \parallel op_R$. Let us note that $op_R$ spans two rounds: in the first one it 
sends the READ() message and in the second one it collects replies. 

- Case (i): $op_R \prec op_W$. This case follows directly from Lemma 1 considering that 
(i) at the end of the first round of $op_R$ (the round in which the READ() message is sent) 
at least $n - f$ correct processes have the same initial value $v = \bot$, (ii) while moving 
to the second round of $op_R$, at most $x$ processes can get faulty (with $x \le f$ for 
models M1-M3 and $x = 0$ for M4), (iii) $n - f - x \ge n - \beta_{M_i} f$ (i.e. $\beta_{M_i} f \ge f + x$) 
for each model (i.e. there will always be enough replies from correct servers to select 
a value) and (iv) $n - \beta_{M_i} f > f$ (i.e. $(\alpha_{M_i} - \beta_{M_i}) f + 1 > f$) for each model. 
It follows that faulty processes cannot force the client to select a wrong value. 

- Case (ii): $op_W \prec op_R$. Let $r$ be the round at which $op_W$ terminates and let $r + 1$ 
be the round at which $op_R$ is invoked. 

Due to Lemma 1, at round $r + 2$ there are enough occurrences (at least $n - \beta f$) 
of the last written value $v$. So, applying the same reasoning as in case (i), the claim 
follows. 

- Case (iii): $op_W \parallel op_R$. Let us note that a read() operation spans two rounds, i.e., 
the round of the request $r_{req}$ and the round of the reply $r_{reply}$. So, let us consider 
them separately. 

• Case (iii-a): $op_W$ is concurrent with $op_R$ during $r_{req}$. In that case the value $v$ 
is delivered to correct servers at the end of $r_{req}$. Due to Lemma 1, at the end of 
$r_{req}$ at least $n - f$ correct servers store the newly written value $v$; we fall 
into case (ii) and the claim follows. 

• Case (iii-b): $op_W$ is concurrent with $op_R$ during $r_{reply}$. Since, in every round, 
the send phase is executed before the receive phase, it follows that at least all 
the correct servers will reply with the value written before the invocation of the 
write() operation; we fall into case (i) and the claim follows. 



□ Theorem 6 
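
The two inequalities invoked in case (i) can be unpacked as follows. This is only a worked restatement, assuming $n$, $f$ and $x$ are non-negative integers with $n > \alpha_{M_i} f$; it adds nothing to the proof itself.

```latex
\begin{align*}
n - f - x \ge n - \beta_{M_i} f
  &\iff \beta_{M_i} f \ge f + x, \\
n > \alpha_{M_i} f
  &\implies n \ge \alpha_{M_i} f + 1 \\
  &\implies n - \beta_{M_i} f \ge (\alpha_{M_i} - \beta_{M_i}) f + 1 > f
  \quad \text{whenever } \alpha_{M_i} - \beta_{M_i} \ge 1 .
\end{align*}
```

The first line holds with $\beta_{M_i} = 2$ for any $x \le f$, and with $\beta_{M_i} = 1$ when $x = 0$. The last line says that the number of matching replies a reader requires always exceeds $f$, so $f$ faulty servers alone cannot reach it.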


Theorem 7 (Ordering). There exists a total order $S$ of read() and write() operations 
such that (i) if $op \prec op'$ then $op$ appears before $op'$ in $S$ and (ii) any read() operation 
returns the value $v$ written by the last write() preceding it in $S$. 

Proof Consider two read() operations, $op_{R_1}$ and $op_{R_2}$, returning respectively $v_1$ and 
$v_2$ (with $v_1 \neq v_2$) such that $op_{R_1} \prec op_{R_2}$. Note that if $op_{R_1}$ returns $v_1$, it follows that 
there exists a write($v_1$) operation, $op_{W(v_1)}$, concurrent with or preceding it in $S$. Suppose by 
contradiction that $op_{W(v_2)} \prec op_{W(v_1)}$. Recall that each read() operation spans two 
rounds and call the first $r_{req}$ and the second $r_{reply}$. Since $op_{R_1}$ returns $v_1$, this means 
that $v_1$ has been stored by servers at the latest during $r_{req}$ of $op_{R_1}$; let us call it $r_{R_1,req}$. The 
same holds for $op_{R_2}$: $v_2$ has been written at the latest during $r_{R_2,req}$ of $op_{R_2}$. Since $op_{R_2}$ 
follows $op_{R_1}$, then $r_{R_1,req} < r_{R_2,req}$. However, [...], which contradicts the 
assumption that $r_{v_1} > r_{v_2}$ (a general scenario is depicted in Fig. 3). □ Theorem 7 

[Figure 3: timeline with labels $r_{R_2,req}$, $r_{R_2,reply}$, $r_{v_1}$, $r_{v_2}$, $op_{W(v_1)}$ and $op_{W(v_2)}$.] 

Fig. 3. A general scenario showing that two subsequent read() operations $op_{R_1}$ and 
$op_{R_2}$ cannot return respectively $v_1$ and $v_2$ if $v_2$ has been written before $v_1$. 


Theorem 8. Let AAreg be the algorithm in Fig. 1-2 and let $n > \alpha f$. If $\alpha = 3$ and 
$\beta = 2$ then AAreg implements a MWMR Atomic register in Garay's model. 

Proof It follows directly from Theorems 5, 6 and 7. □ Theorem 8 

Theorem 9. Let AAreg be the algorithm in Fig. 1-2 and let $n > \alpha f$. If $\alpha = 4$ and 
$\beta = 2$ then AAreg implements a MWMR Atomic register in Bonnet's model. 

Proof It follows directly from Theorems 5, 6 and 7. □ Theorem 9 

Theorem 10. Let AAreg be the algorithm in Fig. 1-2 and let $n > \alpha f$. If $\alpha = 4$ and 
$\beta = 2$ then AAreg implements a MWMR Atomic register in Sasaki's model. 

Proof It follows directly from Theorems 5, 6 and 7. □ Theorem 10 

Theorem 11. Let AAreg be the algorithm in Fig. 1-2 and let $n > \alpha f$. If $\alpha = 2$ and 
$\beta = 1$ then AAreg implements a MWMR Atomic register in Buhrman's model. 

Proof It follows directly from Theorems 5, 6 and 7. □ Theorem 11 
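
The parameters of Theorems 8-11 can be collected in a small lookup, sketched below. The dictionary `MODEL_PARAMS` and the helpers `min_servers`, `max_faults` and `read_quorum` are hypothetical names introduced for illustration only; the $(\alpha, \beta)$ pairs are the ones stated in the theorems, and the arithmetic simply restates $n > \alpha f$ and the read threshold $n - \beta f$.

```python
# (alpha, beta) pairs as stated in Theorems 8-11.
MODEL_PARAMS = {
    "Garay": (3, 2),
    "Bonnet": (4, 2),
    "Sasaki": (4, 2),
    "Buhrman": (2, 1),
}


def min_servers(model: str, f: int) -> int:
    """Smallest integer n satisfying n > alpha * f."""
    alpha, _ = MODEL_PARAMS[model]
    return alpha * f + 1


def max_faults(model: str, n: int) -> int:
    """Largest integer f satisfying alpha * f < n."""
    alpha, _ = MODEL_PARAMS[model]
    return (n - 1) // alpha


def read_quorum(model: str, n: int, f: int) -> int:
    """Number of matching replies, n - beta * f, that a read needs."""
    alpha, beta = MODEL_PARAMS[model]
    if n <= alpha * f:
        raise ValueError("n must be greater than alpha * f")
    return n - beta * f


if __name__ == "__main__":
    for model in MODEL_PARAMS:
        n = min_servers(model, 2)
        print(model, n, read_quorum(model, n, 2))
```

For $f = 2$ this gives a minimum of 7 servers in Garay's model, 9 in Bonnet's and Sasaki's models, and 5 in Buhrman's model.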











5 Conclusion 


This paper presented the first implementation of a multi-writer multi-reader atomic 
register tolerant to mobile Byzantine agents, together with upper bounds on the number 
of faulty processes. We investigated four models of mobile Byzantines in round-based 
synchronous systems: the model of Garay et al. [8], where nodes have the capability to 
detect an infection and clean their state after the Byzantine agent leaves the node; the 
models of Sasaki et al. [14] and Bonnet et al. [4], where infected nodes may execute 
their code with a corrupted state even though the mobile agent is no longer located 
at the node; and finally, the model of Buhrman et al. [6], where Byzantine moves are 
tied to messages and happen during the send phase. As for the case of the agreement 
problem (a benchmark already investigated in all these models), our study shows that 
atomic registers cannot be implemented using the static bounds on the number of faulty 
processes. That is, we prove that in Garay's model atomic registers can be implemented 
provided that in each round the number of Byzantine nodes (nodes occupied 
by a Byzantine agent), $f$, is less than $n/3$, where $n$ is the number of correct nodes in 
that round, while in Bonnet's and Sasaki's models the number of Byzantine nodes 
$f$ must be less than $n/4$. Finally, for the case of Buhrman's model we show that $f$ should be 
less than $n/2$. Our study can be extended in several directions (hereafter we mention 
only two of them). First, an interesting issue is to investigate the storage problem in the 
round-free synchronous setting and, furthermore, in asynchronous settings. We conjecture 
that in these models the bounds on the faulty processes are different from the 
round-based case. Secondly, our study advocates revisiting other building blocks of 
distributed computing in these settings (e.g. quorums, k-set agreement, synchronization, 
etc.). In all these cases we conjecture lower and upper bounds different from the static 
case. 
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